Transitivity and Foregrounding in News Articles: Experiments in Information Retrieval and Automatic Summarising
نویسندگان
چکیده
This paper describes an ongoing study which applies the concept of transitivity to news discourse for text processing tasks. The complex notion of transitivity is defined and the relationship between transi-tivity and information foregrounding is explained. A sample corpus of news articles has been coded for transitivity. The corpus is being used in two text processing experiments .
منابع مشابه
News Image Annotation on a Large Parallel Text-image Corpus
In this paper, we present a multimodal parallel text-image corpus, and propose an image annotation method that exploits the textual information associated with images. Our corpus contains news articles composed of a text, images and image captions, and is significantly larger than the other news corpora proposed in image annotation papers (27,041 articles and 42,568 captionned images). In our e...
متن کاملUnderstanding news story chains using information retrieval and network clustering techniques
Content analysis of news stories (whether manual or automatic) is a cornerstone of the communication studies field. However, much research is conducted at the level of individual news articles, despite the fact that news events (especially significant ones) are frequently presented as “stories” by news outlets: chains of connected articles covering the same event from different angles. These st...
متن کاملArabic News Articles Classification Using Vectorized-Cosine Based on Seed Documents
Besides for its own merits, text classification (TC) has become a cornerstone in many applications. Work presented here is part of and a pre-requisite for a project we have overtaken to create a corpus for the Arabic text process. It is an attempt to create modules automatically that would help speed up the process of classification for any text categorization task. It also serves as a tool for...
متن کاملSemiautomatic Image Retrieval Using the High Level Semantic Labels
Content-based image retrieval and text-based image retrieval are two fundamental approaches in the field of image retrieval. The challenges related to each of these approaches, guide the researchers to use combining approaches and semi-automatic retrieval using the user interaction in the retrieval cycle. Hence, in this paper, an image retrieval system is introduced that provided two kind of qu...
متن کاملNews-Oriented Automatic Chinese Keyword Indexing
In our information era, keywords are very useful to information retrieval, text clustering and so on. News is always a domain attracting a large amount of attention. However, the majority of news articles come without keywords, and indexing them manually costs highly. Aiming at news articles’ characteristics and the resources available, this paper introduces a simple procedure to index keywords...
متن کامل